Skip to content

Add hosted Defuddle URL conversion to the app#29

Merged
imadreamerboy merged 3 commits intomainfrom
defuddle
Mar 28, 2026
Merged

Add hosted Defuddle URL conversion to the app#29
imadreamerboy merged 3 commits intomainfrom
defuddle

Conversation

@imadreamerboy
Copy link
Copy Markdown
Owner

Summary

  • add URL inputs to the home flow so users can queue websites alongside files
  • convert website URLs through the hosted Defuddle API and surface rate-limit/service errors clearly
  • add shared input-source helpers for URL detection, display names, and output filenames
  • document website conversion in the README and Help FAQ, including free-tier limits and service behavior
  • extend tests for Defuddle conversion, input-source helpers, and new translation keys

Testing

  • .\.venv-win\Scripts\python.exe -m pytest -q

@imadreamerboy imadreamerboy marked this pull request as ready for review March 28, 2026 12:32
@imadreamerboy imadreamerboy requested a review from Copilot March 28, 2026 12:32
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class website URL inputs to the home flow and routes URL conversions through the hosted Defuddle service, with supporting helpers, UI, translations, docs, and tests.

Changes:

  • Add URL input bars to the empty and queue states and allow queueing URLs alongside files.
  • Add Defuddle-backed URL conversion path in the conversion pipeline and surface Defuddle backend usage in summaries.
  • Introduce shared input-source helpers (URL detection, display name, output filename stem) and update docs/translations/tests.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/utils/test_translations.py Extends translation-key coverage for new URL/Defuddle/help strings.
tests/core/test_input_sources.py Adds unit tests for URL detection, display names, and filename stem sanitization.
tests/core/test_conversion.py Adds tests for Defuddle URL conversion behavior and error surfacing.
markitdowngui/utils/translations.py Adds new UI/help strings and updates copy to reflect URL inputs.
markitdowngui/ui/home_interface.py Adds URL input UI, queues URL “sources”, and updates progress/result labeling for URLs.
markitdowngui/ui/help_interface.py Adds Defuddle docs link and new FAQ entries for URL conversion.
markitdowngui/ui/components/url_input_bar.py New reusable URL input widget used by the home flow.
markitdowngui/core/input_sources.py New helpers for URL detection, display names, and output filename stems.
markitdowngui/core/conversion.py Adds Defuddle backend + HTTP request path for URL conversions.
README.md Documents website URL conversion behavior, dependencies, and free-tier limits.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

candidate = value.strip()
if not candidate:
return False

Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_web_url() is used as the primary validation gate for URL input, but it currently returns True for strings that contain whitespace in the URL path (e.g., "https://example.com/hello world"). Those inputs will later fail during HTTP request construction. Consider tightening validation (reject any whitespace / control chars) or normalizing to a properly encoded URL before enqueueing so the UI’s "Invalid URL" path catches these cases.

Suggested change
# Reject URLs containing any whitespace or control characters, as these
# will cause failures when constructing HTTP requests unless encoded.
if any(ch.isspace() or ord(ch) < 32 for ch in candidate):
return False

Copilot uses AI. Check for mistakes.

assert outcome.markdown == "# Article"
assert outcome.backend == conversion.BACKEND_DEFUDDLE
assert captured["url"] == "https://defuddle.md/https://example.com/article"
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These assertions lock in the exact Defuddle request URL string. If the implementation is updated to percent-encode/normalize embedded target URLs (to handle spaces, fragments, etc.), this test will need to assert the normalized/encoded request URL instead of the raw concatenation.

Suggested change
assert captured["url"] == "https://defuddle.md/https://example.com/article"
expected_url = conversion._build_defuddle_request_url("https://example.com/article")
assert captured["url"] == expected_url

Copilot uses AI. Check for mistakes.
Comment on lines +324 to +325
def _build_defuddle_request_url(url: str) -> str:
return f"{DEFUDDLE_API_BASE_URL}{url.strip()}"
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_build_defuddle_request_url() concatenates the raw user-provided URL into the Defuddle request URL. If the pasted URL contains characters that must be percent-encoded in a URL (common case: spaces), requests.get() will raise an InvalidURL/RequestException and the app will report it as a service reachability problem. Consider normalizing/encoding the embedded URL (e.g., percent-encode the target URL before appending) so the request URL is always syntactically valid, and so reserved characters like # aren’t silently treated as fragments by the client.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6310467ddd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".



def _build_defuddle_request_url(url: str) -> str:
return f"{DEFUDDLE_API_BASE_URL}{url.strip()}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Percent-encode URL before building Defuddle request path

_build_defuddle_request_url appends the raw URL string directly to https://defuddle.md/, so any query string is sent as the outer request query (for example .../https://site/page?key=abc). Defuddle reserves key in the outer query for API authentication, so normal target URLs that legitimately contain ?key=... are treated as auth attempts and can fail with 401 instead of being converted. Encode the full target URL before appending (or send it via a dedicated parameter) so target query params are not reinterpreted as Defuddle control params.

Useful? React with 👍 / 👎.

@imadreamerboy imadreamerboy merged commit 27a3731 into main Mar 28, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants